# Web Adaptation

Qwen3 1.7B ONNX
Qwen3-1.7B is an open-source, 1.7B-parameter Transformer language model released by Alibaba Cloud, provided here in ONNX format for general natural language processing tasks.
Large Language Model Transformers
onnx-community
189
1
Whisper Large V3 Turbo
An ONNX-optimized version of the Whisper large-v3-turbo speech recognition model, designed for web deployment.
Speech Recognition Transformers
onnx-community
2,988
54
Timesformer Base Finetuned K600
TimeSformer is a video understanding model based on the Transformer architecture, specifically designed for video classification tasks.
Video Processing Transformers
onnx-community
16
0
Depth Anything V2 Base
Depth-Anything-V2-Base is an ONNX-format depth estimation model adapted for Transformers.js, designed for image depth estimation on the web.
3D Vision Transformers
onnx-community
56
0
Whisper Base.en
Whisper is a general-purpose speech recognition model from OpenAI, trained with large-scale weak supervision. The base.en checkpoint is the English-only variant and transcribes English speech.
Speech Recognition Transformers
onnx-community
76
1
Musicgen Small
MusicGen Small is a Transformer-based music generation model capable of producing high-quality music clips from text descriptions.
Audio Generation Transformers
Xenova
5,434
24
Yolov9 C All
GPL-3.0
An object detection model based on YOLOv9, adapted for Transformers.js and able to run in the browser.
Object Detection Transformers
Xenova
176
2
Depth Anything Large Hf
An ONNX version of the Depth Anything large depth estimation model, adapted for Transformers.js and suitable for web applications.
3D Vision Transformers
Xenova
19
3
Hubert Base Superb Ks
A voice command recognition model based on the HuBERT architecture, fine-tuned for keyword spotting.
Audio Classification Transformers
Xenova
17
1
Dpt Hybrid Midas
A hybrid depth estimation model developed by Intel that combines a convolutional backbone with a Transformer architecture.
3D Vision Transformers
Xenova
23
0
Nougat Base
Nougat is a vision-based academic document understanding model capable of converting scientific PDF images into Markdown-formatted text.
Image-to-Text Transformers
Xenova
24
3
Trocr Base Printed
TrOCR is a Transformer-based OCR model specifically designed for recognizing printed text.
Text Recognition Transformers
Xenova
40
0
Trocr Small Printed
TrOCR-small-printed is a compact optical character recognition (OCR) model specifically designed for printed text recognition.
Text Recognition Transformers
Xenova
79
3
Distilbart Cnn 12 6
DistilBART-CNN-12-6 is a distilled version of the BART model, optimized for text summarization tasks, with a smaller size while maintaining high performance.
Text Generation Transformers
Xenova
218
0
Yolos Base
YOLOS is an object detection model based on the Transformer architecture, designed for efficient visual task processing.
Object Detection Transformers
Xenova
16
0
Yolos Small
YOLOS-small is a small object detection model based on the Transformer architecture, designed for efficient visual tasks.
Object Detection Transformers
Xenova
63
0
E5 Small V2
E5-small-v2 is an efficient text embedding model suitable for various natural language processing tasks.
Text Embedding Transformers
Supabase
35
2
Mms Lid 4017
MMS-LID-4017 is a spoken language identification model developed by Facebook that identifies which of 4,017 languages is being spoken in an audio clip.
Audio Classification Transformers
Xenova
15
1
Mms Lid 126
MMS-LID-126 is a spoken language identification model released by Facebook that identifies which of 126 languages is being spoken in an audio clip.
Audio Classification Transformers
Xenova
14
0
Ast Finetuned Speech Commands V2
A voice command recognition model based on the Audio Spectrogram Transformer (AST) architecture, provided in ONNX format for web deployment.
Audio Classification Transformers
Xenova
15
0
Ast Finetuned Audioset 10 10 0.4593
An Audio Spectrogram Transformer (AST) model fine-tuned on the AudioSet dataset for audio classification tasks.
Audio Classification Transformers
Xenova
82
0
Whisper Medium
Whisper Medium is a medium-scale speech recognition model developed by OpenAI, supporting automatic speech recognition (ASR) tasks in multiple languages.
Speech Recognition Transformers
Xenova
871
4
Detr Resnet 101
An end-to-end object detection model based on the Transformer architecture, with a ResNet-101 feature extractor.
Object Detection Transformers
Xenova
216
2
Bart Large Cnn
A large text summarization model based on the BART architecture, fine-tuned on the CNN/DailyMail dataset.
Text Generation Transformers
Xenova
173
8